Data Resampling for Path Based Clustering

Authors

  • Bernd Fischer
  • Joachim M. Buhmann
Abstract

Path Based Clustering assigns two objects to the same cluster if they are connected by a path with high similarity between adjacent objects on the path. In this paper, we propose a fast agglomerative algorithm to minimize the Path Based Clustering cost function. To enhance the reliability of the clustering results, a stochastic resampling method is used to generate candidate solutions, which are merged to yield empirical assignment probabilities of objects to clusters. The resampling algorithm measures the reliability of the clustering solution and, based on its stability, determines the number of clusters.
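The path idea in the abstract can be illustrated with a small sketch. It assumes one common way of scoring a path, namely by its largest step, and defines the effective dissimilarity of two objects as the smallest such score over all connecting paths. The function name `minimax_path_dissimilarity` and the toy data are hypothetical; this is not the paper's cost function, its agglomerative algorithm, or its resampling scheme.

```python
def minimax_path_dissimilarity(d):
    """Return the effective dissimilarity matrix for a symmetric matrix d.

    Assumption for illustration: a path is scored by its largest step,
    and two objects are as dissimilar as the best path between them.
    Uses a Floyd-Warshall style update, O(n^3) time.
    """
    n = len(d)
    eff = [row[:] for row in d]  # copy so the input is left unchanged
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Largest step on the path i -> k -> j
                via_k = max(eff[i][k], eff[k][j])
                if via_k < eff[i][j]:
                    eff[i][j] = via_k
    return eff


# Toy data: points 0..3 form a chain with unit steps, 10 lies far away.
points = [0.0, 1.0, 2.0, 3.0, 10.0]
d = [[abs(a - b) for b in points] for a in points]
eff = minimax_path_dissimilarity(d)

print(d[0][3], eff[0][3])  # direct distance vs. path-based dissimilarity
print(d[0][4], eff[0][4])
```

In this toy case the two ends of the chain are separated by a direct distance of 3 but a path-based dissimilarity of 1, because every step along the chain is short. That is the sense in which elongated groups of objects can end up in one cluster.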

Similar resources

Bagging for Path-Based Clustering

A resampling scheme for clustering with similarity to bootstrap aggregation (bagging) is presented. Bagging is used to improve the quality of path-based clustering, a data clustering method that can extract elongated structures from data in a noise-robust way. The results of an agglomerative optimization method are influenced by small fluctuations of the input data. To increase the reliability o...

Resampling-based selective clustering ensembles

Traditional clustering ensemble methods combine all clustering results at hand. However, we observe that a better clustering solution can often be achieved if only a subset of the available clustering results is combined. This paper proposes a novel clustering ensemble method, termed the resampling-based selective clustering ensembles method. The proposed selective clustering ensembles me...

Resampling Method for Unsupervised Estimation of Cluster Validity

We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is introduced. Clusters that are stable against resampling give rise to local maxima of this figure of merit. This is presented first for a one-dimensional data set,...

Resampling for Fuzzy Clustering

Resampling methods are among the best approaches to determine the number of clusters in prototype-based clustering. The core idea is that with the right choice for the number of clusters basically the same cluster structures should be obtained from subsamples of the given data set, while a wrong choice should produce considerably varying cluster structures. In this paper I give a brief overview...

Resampling Method For Unsupervised Estimation Of Cluster

We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is introduced. Clusters which are stable against resampling give rise to local maxima of this figure of merit. This is presented first for a one-dimensional data set, for...

Publication date: 2002